Abstract

Motivated by a study about prompt coronary angiography in myocardial infarction, 
we propose a method to estimate the causal effect of a treatment in two-arm experimental 
studies with possible non-compliance in both treatment and control arms. The method is 
based on a causal model for repeated binary outcomes (before and after the treatment), 
which includes individual covariates and latent variables for the unobserved heterogeneity 
between subjects. Moreover, given the type of non-compliance, the model assumes the 
existence of three subpopulations of subjects: compliers, never-takers, and always-takers. 
The model is estimated by a two-step estimator: at the first step the probability that a 
subject belongs to one of the three subpopulations is estimated on the basis of the avail- 
able covariates; at the second step the causal effects are estimated through a conditional 
logistic method, the implementation of which depends on the results from the first step. 
Standard errors for this estimator are computed on the basis of a sandwich formula. The 
application shows that prompt coronary angiography in patients with myocardial infarc- 
tion may significantly decrease the risk of other events within the next two years, with a 
log odds-ratio of about -2. Given that non-compliance is significant for patients being given the 
treatment because of high risk conditions, classical estimators fail to detect, or at least 
underestimate, this effect. 
1 Introduction



It is well known that, even in experimental studies, non-compliance is a strong source of confounding in the estimation of the causal effect of a treatment, in particular when measured and/or unmeasured factors affect both the decision to comply and the reaction to the treatment. There are basically three approaches to causal inference in these circumstances. These are based on: (i) potential outcomes or counterfactuals (e.g., Rubin, 1974, 1978; Holland, 1986; Angrist et al., 1996; Abadie, 2003; Rubin, 2005), (ii) marginal structural models and inverse probability estimators for these models (Robins, 1989, 1994), or (iii) directed acyclic graphs (DAGs) formalized by Pearl (1995, 2009). In particular, for two-arm experimental studies with all-or-nothing compliance¹, Bartolucci (2010) developed a method that may be applied with repeated binary outcomes and is based on a modified version of the conditional logistic estimator (Breslow and Day, 1980; Collett, 1991; Rothman and Greenland, 1998; Hosmer and Lemeshow, 2000). This method is based on a DAG model with latent variables, the parameters of which have a causal interpretation. The same model may be formulated on the basis of potential outcomes. The estimator is simple to apply, but in the formulation of Bartolucci (2010) it may be applied only when non-compliance occurs in the treatment arm alone and therefore, using the terminology of Angrist et al. (1996), there are only compliers (who always comply with the treatment) and never-takers (who never take the treatment regardless of the assigned arm).

Motivated by an original application about the effectiveness of coronary angiography (CA) in patients with non-ST elevation acute coronary syndrome, in this paper we extend the approach of Bartolucci (2010) by considering cases in which non-compliance may also be observed in the control arm. Therefore, there are three subpopulations: compliers, never-takers, and always-takers (who always take the treatment regardless of the assigned arm). In particular, we extend the causal model of Bartolucci (2010) and, basically following the same inferential approach, we develop a conditional likelihood estimator of the causal effects. The latter may be simply applied. It is worth noting that these causal effects are measured on the logit scale, given that we are dealing with binary outcomes; the same scale is used in relevant approaches to causal inference (e.g., Ten Have et al., 2003; Vansteelandt and Goetghebeur, 2003; Robins and Rotnitzky, 2004; Vansteelandt and Goetghebeur, 2007). Moreover, as in Bartolucci (2010), the adopted estimator is based on two steps. At the first step we estimate the probability that a subject is a complier, a never-taker, or an always-taker on the basis of observable covariates for this subject. At the second step, the conditional likelihood of a logistic model, based on a suitable design matrix, which is set up by using the results from the first step, is maximized by a simple Newton-Raphson algorithm. Given the two-step formulation of the estimator, we use a sandwich formula (White, 1982) for deriving standard errors. These may be used to test the significance of the causal parameters.

¹ All-or-nothing compliance means that the treatment may be taken or not, ruling out partial compliance; for an approach specifically tailored to partial compliance see Bartolucci and Grilli (2011).


As mentioned above, we develop our methodology in connection with an original study on 
CA in patients with non-ST elevation acute coronary syndrome. In particular, we are inter- 
ested in investigating whether a prompt CA (within 48h from hospital admission) should be 
recommended in light of a lower risk of recurrent cardiovascular events after leaving the hospi- 
tal. A prompt CA, together with ECG and other exams performed on patients with coronary 
syndrome, may be helpful in better calibrating an in-hospital treatment. Even if the current 
guidelines of the European cardiologic society recommend CA within 48h of hospitalization (Bertrand et al., 2002), in some hospitals patients are submitted to CA only after a few days, or even not at all. In the cardiology literature a definite recommendation has not yet emerged, with some studies reporting equivalence of CA performed before or after 48h of hospitalization (TIMI IIIB Investigators, 1994; Boden et al., 1998; McCullough et al., 1998), and other studies reporting superiority of prompt CA (FRagmin and Fast Revascularization during Instability in Coronary a..., 1999; Cannon et al., 2001; Fox et al., 2002). In our data, the medium/long term effects of coronary angiography within 48h from hospital admission have been estimated using a control given
by the usual clinical practice in the hospital, which may or may not include the coronary an- 
giography; when included, the designed study planned to schedule it only after at least 48h 
from hospitalization. Then, subjects assigned to the treatment group are expected to undergo 
CA within 48h from hospitalization, whereas patients assigned to the control group may or 
may not undergo CA. When a patient in the control group is submitted to CA, the analysis 
is expected to be executed after 48h from the hospitalization. Patients were randomized immediately at hospitalization. In practice, a significant fraction of controls received CA within 
48h from hospitalization, possibly due to the need of information in order to promptly proceed 
with a treatment. Furthermore, a significant fraction of patients in the active group (treat- 
ment arm) did receive CA, but after 48h from hospitalization, possibly due to a busy hospital 
schedule which did not allow prompt CA performance. We consequently have a significant non- 
compliance in both arms, leading to the presence of never-takers and always-takers in addition 
to compliers. Note that non-compliance in this example is more likely a choice of the doctor, 
rather than of the patient. 

We focus on a relevant group of patients, those arriving at the hospital with myocardial 
infarction. From our analyses, based on the causal inference approach here proposed, two 
important findings emerge. First of all, there is a significant causal effect of prompt CA, with a 
log odds-ratio of about -2 and p-value equal to 0.009. Hence, patients arriving at the hospital 
with myocardial infarction should be submitted for coronary angiography within 48h, and this 
will help doctors in greatly decreasing the risk of recurrent events after discharge. Secondly, 
we estimate the effects separately on the four groups (never-takers receiving control, compliers 
receiving control, compliers receiving treatment, and always-takers receiving treatment), and 
we observe that the bias arises mostly from the always-takers. In fact, the treatment has 
substantially no effect on the always-takers, but we estimate a strong effect on compliers. 

The paper is organized as follows. In Section 2 we briefly describe the data from the study 
motivating this paper. In Section 3 we introduce the causal model for repeated binary response 
variables. The proposed two-step estimator is described in Section 4 and its application to 
the dataset deriving from the cardiology study outlined above is described in Section 5. Final 
conclusions are reported in Section 6. 

We implemented the estimator in an R function that we make available to the reader upon 
request. 
2 Description of the Prompt Coronary Angiography data



The multicenter trial we consider is based on the inclusion of patients arriving at the hospital 
with the last episode of angina pectoris within the previous 24 hours. The patients were included in the 
study if they were diagnosed with a myocardial infarction. Patients with persistent ST elevation or 
who could not undergo CA were excluded from the study. 

The binary response of interest is the recurrence within 2 years after leaving the hospital of 
any among: (i) another episode of myocardial infarction, (ii) an episode of angina pectoris of 
duration 20 minutes or longer, (iii) other significant cardiovascular events, or (iv) death which 
could be related to the current episode. The recorded data concern the presence or absence 
of episodes of angina pectoris, myocardial infarction or other cardiovascular events within the 
last month before hospitalization, and other covariates. The first can be considered as a 
pre-treatment copy of the outcome, which we denote by $Y_1$. Among the covariates there are: 
gender, age, smoking status, statin use, history of CHD in the family, hypertension, and glycemic index 
(GI) at hospitalization. We are interested in investigating the effect of a prompt CA since our 
population of patients with myocardial infarction (IMA) at hospitalization could probably be 
better treated after CA, and this could prevent further events. 

Overall, we have data on n = 1,560 subjects, whose characteristics are summarized as 
follows: there are 63% males, 46% smokers, 75% have a history of CHD in the family, 31% 
have hypertension, and 81% use statins regularly. GI has a strongly skewed distribution, with 
median equal to 118 and MAD equal to 34; moreover, the mean age is 67.5 with a standard 
deviation of 10.8. 

Randomization was performed with a proportion of 1:2, and in fact 66% of the patients are 
assigned to the prompt CA group. Only 52% of the patients actually were submitted to prompt 
CA. There was non-compliance in both groups, with roughly one third of the subjects assigned to 
each group ending up taking the other treatment. More precisely, 370 subjects assigned to the 
prompt CA did undergo CA later than 48h after hospitalization, and 170 patients assigned to 
the control group had prompt CA. 

Given that after model selection we will conclude that GI and use of statins are predictive 
of compliance (see Section 5), we study these two variables a bit more in depth here. For this 
aim, in Table 1 we report the proportion of patients belonging to the groups of untreated 
as assigned (assigned and received control), always-takers (assigned to control and received 
treatment), never-takers (assigned to treatment and received control), or treated as assigned 
(assigned and received treatment), given the level of GI and the use or not of statins. The level 
of GI is discretized on the basis of the quartiles of the empirical distribution. It is important to 
underline that the first and last groups are made of both compliers and subjects who were by 
chance assigned to the treatment they would have received anyway. That is, in the first group 
we have both compliers assigned to the control and never-takers randomized to the control; in 
the last group we have both compliers assigned to the treatment and always-takers who were 
also randomized to the treatment. 









                                        GI quartile                  Use of statins
Arm        Group                        1st    2nd    3rd    4th     No     Yes
Control    Compliers + never-takers     0.634  0.702  0.674  0.638   0.606  0.676
           Always-takers                0.366  0.298  0.326  0.362   0.394  0.324
Treatment  Never-takers                 0.336  0.335  0.389  0.430   0.443  0.352
           Compliers + always-takers    0.664  0.665  0.611  0.570   0.557  0.648

Table 1: Conditional proportion of the groups of untreated as assigned (compliers + never-takers), 
always-takers, never-takers, or treated as assigned (compliers + always-takers), given 
GI and the use or not of statins 

From the results in Table 1 it can be seen that the proportion of never-takers tends to 
increase with GI, whereas the proportion of always-takers is larger for the first and last quartiles. 
On the other hand, the use of statins seems to increase the compliance in both directions, with 
a decrease of 7 percentage points for always-takers and 9 percentage points for never-takers. 
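
The proportions in Table 1 also allow a rough back-of-the-envelope reading of the share of compliers. The sketch below is not the estimator proposed in this paper: it only assumes that, by randomization and in the absence of defiers, the always-taker share observed in the control arm and the never-taker share observed in the treatment arm apply to the whole stratum, so that the complier share is what remains. The function name `complier_share` is illustrative.

```python
# Shares read off Table 1, by use of statins.
always_taker_share = {"no": 0.394, "yes": 0.324}  # control arm, received prompt CA
never_taker_share = {"no": 0.443, "yes": 0.352}   # treatment arm, did not receive prompt CA


def complier_share(p_always, p_never):
    """Implied complier share, assuming the three shares sum to one."""
    return 1.0 - p_always - p_never


shares = {
    key: complier_share(always_taker_share[key], never_taker_share[key])
    for key in always_taker_share
}

for key, value in shares.items():
    print(f"statins={key}: implied complier share = {value:.3f}")
```

Under these assumptions the implied complier share is noticeably larger among statin users, in line with the remark that statin use increases compliance in both directions.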

3 The causal model 

Let $Y_1$ and $Y_2$ denote the binary outcomes of interest, let $V$ be a vector of observable covariates, let $Z$ be a binary variable equal to 1 when a subject is assigned to the treatment and to 0 when he/she is assigned to the control, and let $X$ be the corresponding binary variable for the treatment actually received. In the present framework $V$ and $Y_1$ are pre-treatment variables, whereas $Y_2$ is a post-treatment variable. Moreover, non-compliance of the subjects involved in the experimental study implies that $X$ may differ from $Z$, since we consider experimental studies in which subjects randomized to both arms can access the treatment and therefore any configuration of $(Z, X)$ may be observed. Consequently, we assume the existence of three subpopulations of subjects enrolled in the study: compliers, never-takers, and always-takers (Angrist et al., 1996). This rules out the presence of defiers, that is, subjects that systematically take the treatment if assigned to the control arm and vice-versa.

In the following, we introduce a latent variable model for the analysis of data deriving from the experimental study described above. This model extends that proposed by Bartolucci (2010) to deal with two-arm experimental studies of the same type in which, however, non-compliance may be only observed in the treatment arm. We then derive results about the proposed model which are useful for making inference on its parameters.



3.1 Model assumptions 

We assume that the behavior of a subject depends on the observable covariates $V$, a latent variable $U$ representing the effect of unobservable covariates on both response variables, and a latent variable $C$ representing the attitude to comply with the assigned treatment. The last one, in particular, is a discrete variable with three possible values: 0 for never-takers, 1 for compliers, and 2 for always-takers. The model is based on assumptions A1-A5 that are reported below. In formulating these assumptions we use the symbol $W_1 \perp\!\!\!\perp W_2 \mid W_3$ to denote conditional independence between the random variables $W_1$ and $W_2$ given $W_3$; this notation naturally extends to random vectors. Moreover, with reference to the variables in our study, we also let $p_1(y|u,v) = \mathrm{pr}(Y_1 = y \mid U = u, V = v)$ and $p_2(y|u,v,c,x) = \mathrm{pr}(Y_2 = y \mid U = u, V = v, C = c, X = x)$, and by $1\{\cdot\}$ we denote the indicator function.
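The model assumptions are:

A1: $C \perp\!\!\!\perp Y_1 \mid (U, V)$;

A2: $Z \perp\!\!\!\perp (U, Y_1, C) \mid V$;

A3: $X \perp\!\!\!\perp (U, V, Y_1) \mid (C, Z)$ and, with probability 1, $X = Z$ when $C = 1$ (compliers), $X = 0$ when $C = 0$ (never-takers), and $X = 1$ when $C = 2$ (always-takers);

A4: $Y_2 \perp\!\!\!\perp (Y_1, Z) \mid (U, V, C, X)$;

A5: for all $u$, $v$, $c$ and $x$, we have

$$\mathrm{logit}[p_2(1|u,v,c,x)] - \mathrm{logit}[p_1(1|u,v)] = t(c,x)'\beta,$$

where

$$t(c,x) = \begin{pmatrix} 1\{c=0\}(1-x) \\ 1\{c=1\}(1-x) \\ 1\{c=1\}x \\ 1\{c=2\}x \end{pmatrix} \quad\text{and}\quad \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix}.$$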

According to assumption A1 the tendency to comply depends only on $(U, V)$, whereas according to assumption A2 the randomization only depends on the observable covariates in $V$. This assumption is typically satisfied in randomized experiments of our interest and, in any case, it may be relaxed by requiring that $Z$ is conditionally independent of $U$ given $(V, Y_1)$; this is shown in Bartolucci (2010). Assumption A3 is rather obvious considering that $C$ represents the tendency of a subject to comply with the assigned treatment. Assumption A4 implies that there is no direct effect of $Y_1$ on $Y_2$, since the distribution of the latter depends only on $(U, V, C, X)$; it also implies an assumption known as exclusion restriction, according to which $Z$ affects $Y_2$ only through $X$. Finally, assumption A5 states that the distribution of $Y_2$ depends on a vector of causal parameters $\beta$, the elements of which are interpretable as follows:

• $\beta_0$: effect of control on never-takers;

• $\beta_1$: effect of control on compliers;

• $\beta_2$: effect of treatment on compliers;

• $\beta_3$: effect of treatment on always-takers.
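
To make the role of the design vector in assumption A5 concrete, the following minimal sketch codes $t(c,x)$ and the resulting shift $t(c,x)'\beta$ on the logit scale. The function names and the numerical values of `beta` are purely illustrative assumptions, not estimates from the study.

```python
def design_vector(c, x):
    """t(c, x) of assumption A5: c in {0, 1, 2} is the compliance type,
    x in {0, 1} is the treatment actually received."""
    if c not in (0, 1, 2) or x not in (0, 1):
        raise ValueError("c must be 0, 1 or 2 and x must be 0 or 1")
    return [
        int(c == 0) * (1 - x),  # never-taker under control   -> beta_0
        int(c == 1) * (1 - x),  # complier under control      -> beta_1
        int(c == 1) * x,        # complier under treatment    -> beta_2
        int(c == 2) * x,        # always-taker under treatment -> beta_3
    ]


def logit_shift(c, x, beta):
    """t(c, x)' beta, i.e. logit p2(1|u,v,c,x) - logit p1(1|u,v)."""
    return sum(t * b for t, b in zip(design_vector(c, x), beta))


# Hypothetical parameter values, chosen only for illustration.
beta = [0.5, 1.0, -1.0, 0.2]

# Difference between compliers under treatment and compliers under control.
complier_effect = logit_shift(1, 1, beta) - logit_shift(1, 0, beta)
print(complier_effect)
```

Each admissible pair $(c, x)$ selects exactly one element of $\beta$; the pairs $(0, 1)$ and $(2, 0)$, which assumption A3 rules out, would give a vector of zeros.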
The most interesting quantity to estimate is the causal effect of the treatment over the control in the subpopulation of compliers. In the present context, this effect may be defined as

$$\delta = \mathrm{logit}[p_2(1|u,v,1,1)] - \mathrm{logit}[p_2(1|u,v,1,0)] = \beta_2 - \beta_1$$

and corresponds to the increase of the logit of the probability of success when $x$ goes from 0 to 1, all the other factors remaining unchanged.

The above assumptions imply that the dependence structure between the observable and unobservable variables may be represented by the same DAG reported in Bartolucci (2010). These assumptions lead to a causal model in the sense of Pearl (1995) since all the observable and unobservable factors affecting the response variables of interest are included. Moreover, using the same approach used in Bartolucci (2010), the model may also be formulated in terms of potential outcomes, reinforcing in this way its causal interpretation.



3.2 Preliminary results 



Along the same lines as in Bartolucci (2010), assumptions A1-A5 imply that the probability function of the conditional distribution of $(Y_1, Z, X, Y_2)$ given $(U, V, C)$ is equal to

$$p(y_1, z, x, y_2|u, v, c) = p_1(y_1|u,v)\, q(z|v)\, f(x|c,z)\, p_2(y_2|u,v,c,x),$$

where $q(z|v) = \mathrm{pr}(Z = z \mid V = v)$ and $f(x|c,z) = \mathrm{pr}(X = x \mid C = c, Z = z)$. After some algebra, for the conditional distribution of $(Y_1, Z, X, Y_2)$ given $(U, V)$ we have

$$p(y_1, z, x, y_2|u, v) = \frac{e^{(y_1+y_2)\lambda(u,v)}}{1+e^{\lambda(u,v)}}\, q(z|v) \sum_{c=0}^{2} f(x|c,z)\, \frac{e^{y_2 t(c,x)'\beta}}{1+e^{\lambda(u,v)+t(c,x)'\beta}}\, \pi(c|u,v), \tag{1}$$

where $\lambda(u,v) = \mathrm{logit}[p_1(1|u,v)]$ and $\pi(c|u,v) = \mathrm{pr}(C = c \mid U = u, V = v)$.

This probability function considerably simplifies when $x \neq z$. In fact, for $z = 1$ and $x = 0$, $f(x|c,z)$ is equal to 1 when $c = 0$ (never-takers) and to 0 otherwise. Similarly, for $z = 0$ and $x = 1$, $f(x|c,z)$ is equal to 1 when $c = 2$ (always-takers) and to 0 otherwise. We then have

$$p(y_1, z, x, y_2|u, v) = \frac{e^{(y_1+y_2)\lambda(u,v)}}{1+e^{\lambda(u,v)}}\, q(z|v)\, \frac{e^{y_2 t(c,x)'\beta}}{1+e^{\lambda(u,v)+t(c,x)'\beta}}\, \pi(c|u,v),$$

with

$$c = \begin{cases} 0 & \text{if } z = 1,\ x = 0, \\ 2 & \text{if } z = 0,\ x = 1. \end{cases} \tag{2}$$

Consequently, $(Y_1, Y_2)$ is conditionally independent of $U$ given $(V, Z, X, Y_+)$, with $Y_+ = Y_1 + Y_2$, and $Z \neq X$. In particular, for $y_+ = 1$ we have

$$p(y_1, y_2|u, v, z, x, 1) = p(y_1, y_2|v, z, x, 1) = \frac{e^{y_2 t(c,x)'\beta}}{1+e^{t(c,x)'\beta}},$$

with $c$ defined as in (2).

When $x = z$, the conditional probability $p(y_1, z, x, y_2|u, v)$ has the following expression:

$$p(y_1, z, z, y_2|u, v) = \frac{e^{(y_1+y_2)\lambda(u,v)}}{1+e^{\lambda(u,v)}}\, q(z|v) \sum_{c=z}^{z+1} \frac{e^{y_2 t(c,z)'\beta}}{1+e^{\lambda(u,v)+t(c,z)'\beta}}\, \pi(c|u,v);$$

note that the sum $\sum_{c=z}^{z+1}$ is extended to $c = 0, 1$ for $x = z = 0$ and to $c = 1, 2$ for $x = z = 1$. The latter expression is based on a mixture between the conditional distribution of $Y_2$ for the population of compliers and that of never-takers (when $x = z = 0$) or always-takers (when $x = z = 1$).

Finally, consider the conditional distribution of $(Y_1, Y_2)$ given $(U, V, Z, X, Y_+)$, with $Y_+ = Y_1 + Y_2$. The probability function of this distribution is denoted by $p(y_1, y_2|u, v, z, x, y_+)$ and is equal to 1 for $y_+ = 0, 2$, whereas for $y_+ = 1$ it may be obtained as

$$p(y_1, y_2|u, v, z, x, 1) = \frac{p(y_1, z, x, y_2|u, v)}{p(1, z, x, 0|u, v) + p(0, z, x, 1|u, v)}.$$

An interesting result deriving from (1) is that, when $x \neq z$, the latter expression does not depend on $u$ and is equal to

$$p(y_1, y_2|v, z, x, 1) = \frac{e^{y_2 t(c,x)'\beta}}{1+e^{t(c,x)'\beta}},$$

with $c = 2x$ as in (2). On the other hand, $(Y_1, Y_2)$ is no longer conditionally independent of $U$, given $(V, Z, X, Y_+)$ and $X = Z$. However, we show below that we can approximate the corresponding conditional probability function by a function which is independent of $u$. This is the basis for the pseudo conditional likelihood estimator of $\beta$ and $\delta$ proposed in the next section.
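
The cancellation of $\lambda(u,v)$ for discordant pairs when $x \neq z$, and its failure when $x = z$, can be checked numerically. The sketch below is only an illustration under assumptions A1-A5: the function names, the shift values and the mixture weights are hypothetical, and the factors $q(z|v)$ and, for a single compliance type, $\pi(c|u,v)$ are omitted because they cancel in the ratio.

```python
import math


def expit(a):
    return 1.0 / (1.0 + math.exp(-a))


def joint_given_c(y1, y2, lam, shift):
    """p1(y1|u,v) * p2(y2|u,v,c,x), with lam = logit p1(1|u,v)
    and shift = t(c,x)' beta."""
    p1 = expit(lam) if y1 == 1 else 1.0 - expit(lam)
    p2 = expit(lam + shift) if y2 == 1 else 1.0 - expit(lam + shift)
    return p1 * p2


def cond_prob_single(y2, lam, shift):
    """P(Y2 = y2 | Y1 + Y2 = 1) when only one compliance type contributes
    (case x != z)."""
    num = joint_given_c(1 - y2, y2, lam, shift)
    den = joint_given_c(1, 0, lam, shift) + joint_given_c(0, 1, lam, shift)
    return num / den


def cond_prob_mixture(y2, lam, shifts, weights):
    """Same conditional probability when two compliance types are mixed
    (case x = z), with weights playing the role of pi(c|u,v)."""
    num = sum(w * joint_given_c(1 - y2, y2, lam, s) for s, w in zip(shifts, weights))
    den = sum(
        w * (joint_given_c(1, 0, lam, s) + joint_given_c(0, 1, lam, s))
        for s, w in zip(shifts, weights)
    )
    return num / den


# Case x != z: the value does not change with lam and equals expit(shift).
single = [cond_prob_single(1, lam, 0.7) for lam in (-2.0, 0.0, 1.5)]

# Case x = z: the value changes with lam even with fixed mixture weights.
mixed = [cond_prob_mixture(1, lam, (0.0, 2.0), (0.5, 0.5)) for lam in (-2.0, 2.0)]
print(single, mixed)
```

The second computation shows why an approximation is needed when $x = z$: the dependence on $u$ enters through $\lambda(u,v)$ even if the mixture weights do not vary with $u$.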



4 Pseudo conditional likelihood inference 



For a sample of $n$ subjects included in the two-arm experimental study, let $y_{i1}$ denote the observed value of $Y_1$ for subject $i$, $i = 1, \ldots, n$, let $y_{i2}$ denote the value of $Y_2$ for the same subject, and let $v_i$, $z_i$, and $x_i$ denote the corresponding values of $V$, $Z$, and $X$, respectively.

In the following, we introduce an approach for estimating the causal parameter vector $\beta$ which closely follows that proposed in Bartolucci (2010). The approach relies on the maximization of a likelihood based on the probability function $p(y_1, y_2|v, z, x, 1)$, for the cases in which $(Y_1, Y_2)$ is conditionally independent of $U$ given $(V, Z, X, Y_+)$, and on an approximated version of this function otherwise. The result is a pseudo conditional likelihood estimator, in the sense of White (1982), whose main advantage is the simplicity of use; see also Bartolucci and Nigro (2012) for a related approach applied in a different field. Note that this approach requires the preliminary estimation of the probability that every subject belongs to one of the three subpopulations (compliers, never-takers, and always-takers). Overall, the approach is based on two steps that are detailed in the following.

At the first step we estimate the probabilities that a subject is a never-taker ($c = 0$), a complier ($c = 1$), or an always-taker ($c = 2$). We assume a multinomial logit model with the category of compliers as reference category:

$$\log\frac{\pi(0|v)}{\pi(1|v)} = g(v)'\alpha_0, \tag{3}$$

$$\log\frac{\pi(2|v)}{\pi(1|v)} = g(v)'\alpha_2. \tag{4}$$

This implies that

$$\pi(0|v) = \frac{\exp[g(v)'\alpha_0]}{1+\exp[g(v)'\alpha_0]+\exp[g(v)'\alpha_2]}, \tag{5}$$

$$\pi(1|v) = \frac{1}{1+\exp[g(v)'\alpha_0]+\exp[g(v)'\alpha_2]}, \tag{6}$$

$$\pi(2|v) = \frac{\exp[g(v)'\alpha_2]}{1+\exp[g(v)'\alpha_0]+\exp[g(v)'\alpha_2]}. \tag{7}$$
Given that the assignment is randomized and does not depend on the individual covariates, the parameter vectors $\alpha_0$ and $\alpha_2$ are estimated by maximizing the log-likelihood

$$\ell_1(\alpha_0, \alpha_2) = \sum_i \ell_{1i}(\alpha_0, \alpha_2),$$

$$\ell_{1i}(\alpha_0, \alpha_2) = (1-z_i)(1-x_i)\log[\pi(0|v_i)+\pi(1|v_i)] + (1-z_i)x_i\log\pi(2|v_i) + z_i(1-x_i)\log\pi(0|v_i) + z_i x_i\log[\pi(1|v_i)+\pi(2|v_i)].$$

For this aim, a simple Newton-Raphson algorithm may be used, which is based on the first and second derivatives of this function. In particular, the first derivative of this function may be found as follows. First of all we write

$$\ell_{1i}(\alpha_0, \alpha_2) = (1-z_i)(1-x_i)\log\frac{\pi(0|v_i)+\pi(1|v_i)}{\pi(1|v_i)} + (1-z_i)x_i\log\frac{\pi(2|v_i)}{\pi(1|v_i)} + z_i(1-x_i)\log\frac{\pi(0|v_i)}{\pi(1|v_i)} + z_i x_i\log\frac{\pi(1|v_i)+\pi(2|v_i)}{\pi(1|v_i)} + \log\pi(1|v_i).$$

Then, based on the above assumptions (3) and (4), we have

$$\ell_{1i}(\alpha_0, \alpha_2) = (1-z_i)(1-x_i)\log\{1+\exp[g(v_i)'\alpha_0]\} + (1-z_i)x_i\, g(v_i)'\alpha_2 + z_i(1-x_i)\, g(v_i)'\alpha_0 + z_i x_i\log\{1+\exp[g(v_i)'\alpha_2]\} - \log\{1+\exp[g(v_i)'\alpha_0]+\exp[g(v_i)'\alpha_2]\},$$
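so that

$$\frac{\partial\ell_1(\alpha_0,\alpha_2)}{\partial\alpha_0} = \sum_i \frac{\partial\ell_{1i}(\alpha_0,\alpha_2)}{\partial\alpha_0}, \qquad \frac{\partial\ell_{1i}(\alpha_0,\alpha_2)}{\partial\alpha_0} = [(1-z_i)(1-x_i)\pi^*(0|v_i) + z_i(1-x_i) - \pi(0|v_i)]\, g(v_i),$$

and

$$\frac{\partial\ell_1(\alpha_0,\alpha_2)}{\partial\alpha_2} = \sum_i \frac{\partial\ell_{1i}(\alpha_0,\alpha_2)}{\partial\alpha_2}, \qquad \frac{\partial\ell_{1i}(\alpha_0,\alpha_2)}{\partial\alpha_2} = [z_i x_i\pi^*(2|v_i) + (1-z_i)x_i - \pi(2|v_i)]\, g(v_i),$$

where

$$\pi^*(0|v_i) = \frac{\pi(0|v_i)}{\pi(0|v_i)+\pi(1|v_i)}, \qquad \pi^*(2|v_i) = \frac{\pi(2|v_i)}{\pi(1|v_i)+\pi(2|v_i)}. \tag{8}$$

Moreover, regarding the second derivative, we have

$$\frac{\partial^2\ell_1(\alpha_0,\alpha_2)}{\partial\alpha_0\partial\alpha_0'} = \sum_i \{(1-z_i)(1-x_i)\pi^*(0|v_i)[1-\pi^*(0|v_i)] - \pi(0|v_i)[1-\pi(0|v_i)]\}\, g(v_i)g(v_i)',$$

$$\frac{\partial^2\ell_1(\alpha_0,\alpha_2)}{\partial\alpha_2\partial\alpha_2'} = \sum_i \{z_i x_i\pi^*(2|v_i)[1-\pi^*(2|v_i)] - \pi(2|v_i)[1-\pi(2|v_i)]\}\, g(v_i)g(v_i)'.$$

The estimated parameter vectors, obtained by maximizing $\ell_1(\alpha_0, \alpha_2)$, are denoted by $\hat\alpha_0$ and $\hat\alpha_2$ and the corresponding probabilities are denoted by $\hat\pi(0|v)$, $\hat\pi(1|v)$, and $\hat\pi(2|v)$, which are obtained by (5), (6), and (7), respectively. Finally, by inversion of minus the Hessian matrix, which is based on the second derivatives above, it is also possible to obtain the standard errors for the parameter estimates $\hat\alpha_0$ and $\hat\alpha_2$ in the usual way.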
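
The first-step contribution $\ell_{1i}$ and its score can be coded directly from the expressions above. The sketch below uses only the Python standard library and is meant as an illustration, not as the authors' R implementation; the function names and the numerical values of `g`, `a0` and `a2` are hypothetical. It compares the closed-form score with a central finite difference.

```python
import math


def probs(g, a0, a2):
    """(pi(0|v), pi(1|v), pi(2|v)) as in equations (5)-(7)."""
    e0 = math.exp(sum(gj * aj for gj, aj in zip(g, a0)))
    e2 = math.exp(sum(gj * aj for gj, aj in zip(g, a2)))
    den = 1.0 + e0 + e2
    return e0 / den, 1.0 / den, e2 / den


def loglik_i(z, x, g, a0, a2):
    """Contribution of one subject to the first-step log-likelihood."""
    p0, p1, p2 = probs(g, a0, a2)
    if z == 0 and x == 0:
        return math.log(p0 + p1)  # never-taker or complier
    if z == 0 and x == 1:
        return math.log(p2)       # always-taker
    if z == 1 and x == 0:
        return math.log(p0)       # never-taker
    return math.log(p1 + p2)      # complier or always-taker


def score_i(z, x, g, a0, a2):
    """Closed-form derivatives of loglik_i with respect to a0 and a2."""
    p0, p1, p2 = probs(g, a0, a2)
    star0 = p0 / (p0 + p1)
    star2 = p2 / (p1 + p2)
    c0 = (1 - z) * (1 - x) * star0 + z * (1 - x) - p0
    c2 = z * x * star2 + (1 - z) * x - p2
    return [c0 * gj for gj in g], [c2 * gj for gj in g]


def numeric_score(z, x, g, a0, a2, eps=1e-6):
    """Central finite-difference approximation of the same derivatives."""
    def shifted(vec, j, h):
        out = list(vec)
        out[j] += h
        return out

    d0 = [
        (loglik_i(z, x, g, shifted(a0, j, eps), a2)
         - loglik_i(z, x, g, shifted(a0, j, -eps), a2)) / (2 * eps)
        for j in range(len(a0))
    ]
    d2 = [
        (loglik_i(z, x, g, a0, shifted(a2, j, eps))
         - loglik_i(z, x, g, a0, shifted(a2, j, -eps))) / (2 * eps)
        for j in range(len(a2))
    ]
    return d0, d2


# Hypothetical covariate vector and parameter values.
g, a0, a2 = [1.0, 0.5], [0.2, -0.3], [-0.1, 0.4]
print(score_i(0, 0, g, a0, a2), numeric_score(0, 0, g, a0, a2))
```

Summing `loglik_i` and `score_i` over subjects gives the quantities needed by a Newton-Raphson or any other gradient-based maximizer.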
At the second step, we maximize the following weighted conditional log-likelihood:

$$\ell_2(\beta|\hat\alpha_0,\hat\alpha_2) = \sum_i d_i\, \ell_{2i}(\beta|\hat\alpha_0,\hat\alpha_2),$$

$$\ell_{2i}(\beta|\hat\alpha_0,\hat\alpha_2) = (1-z_i)(1-x_i)\log\sum_{c=0}^{1}\hat\pi^*_{01}(c|v_i)\frac{\exp(y_{i2}\beta_c)}{1+\exp(\beta_c)} + (1-z_i)x_i\log\frac{\exp(y_{i2}\beta_3)}{1+\exp(\beta_3)} + z_i(1-x_i)\log\frac{\exp(y_{i2}\beta_0)}{1+\exp(\beta_0)} + z_i x_i\log\sum_{c=1}^{2}\hat\pi^*_{12}(c|v_i)\frac{\exp(y_{i2}\beta_{c+1})}{1+\exp(\beta_{c+1})},$$

where $d_i = 1\{y_{i1}+y_{i2} = 1\}$, so that only discordant configurations are considered, and

$$\eta_h = \frac{\exp(\beta_h)}{1+\exp(\beta_h)}, \qquad h = 0, \ldots, 3,$$

where $\beta_0$ is the effect of control on never-takers, $\beta_1$ is the effect of control on compliers, $\beta_2$ is the effect of treatment on compliers, and $\beta_3$ is the effect of treatment on always-takers. Finally, as a generalization of (8), we have that

$$\hat\pi^*_{01}(c|v_i) = \frac{\hat\pi(c|v_i)}{\hat\pi(0|v_i)+\hat\pi(1|v_i)}, \qquad c = 0, 1,$$

and

$$\hat\pi^*_{12}(c|v_i) = \frac{\hat\pi(c|v_i)}{\hat\pi(1|v_i)+\hat\pi(2|v_i)}, \qquad c = 1, 2.$$

The first is the probability of being a never-taker or a complier given that the subject is in one of these subpopulations and his/her covariates; a similar interpretation holds for the probabilities of the second type.

In order to compute the first and second derivatives of $\ell_2(\beta|\hat\alpha_0,\hat\alpha_2)$ with respect to $\beta$, it is convenient to express the $i$-th component of this function as

$$\ell^*_{2i}(\eta|\hat\alpha_0,\hat\alpha_2) = y_{i2}\log(w_i'\eta) + (1-y_{i2})\log(1-w_i'\eta),$$

where $\eta = (\eta_0, \eta_1, \eta_2, \eta_3)'$ and the vector $w_i$ is defined as follows depending on $z_i$, $x_i$ and the estimates from the first step:

$$w_i = \begin{cases} (\hat\pi^*_{01}(0|v_i),\ \hat\pi^*_{01}(1|v_i),\ 0,\ 0)', & \text{if } z_i = x_i = 0, \\ (0, 0, 0, 1)', & \text{if } z_i = 0,\ x_i = 1, \\ (1, 0, 0, 0)', & \text{if } z_i = 1,\ x_i = 0, \\ (0,\ 0,\ \hat\pi^*_{12}(1|v_i),\ \hat\pi^*_{12}(2|v_i))', & \text{if } z_i = x_i = 1. \end{cases}$$
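
The weight vector $w_i$ and the corresponding contribution to the second-step log-likelihood can be sketched as follows. This is a minimal illustration with hypothetical function names and hypothetical values for the first-step probabilities and for $\beta$; it is not the authors' implementation.

```python
import math


def weight_vector(z, x, pi_hat):
    """w_i as a function of (z_i, x_i) and the first-step estimates
    pi_hat = (pi(0|v_i), pi(1|v_i), pi(2|v_i))."""
    p0, p1, p2 = pi_hat
    if z == 0 and x == 0:
        return [p0 / (p0 + p1), p1 / (p0 + p1), 0.0, 0.0]
    if z == 0 and x == 1:
        return [0.0, 0.0, 0.0, 1.0]
    if z == 1 and x == 0:
        return [1.0, 0.0, 0.0, 0.0]
    return [0.0, 0.0, p1 / (p1 + p2), p2 / (p1 + p2)]


def loglik2_i(y1, y2, z, x, pi_hat, beta):
    """d_i * l*_2i: concordant pairs (y1 + y2 != 1) contribute zero."""
    if y1 + y2 != 1:
        return 0.0
    eta = [1.0 / (1.0 + math.exp(-b)) for b in beta]
    mix = sum(w * e for w, e in zip(weight_vector(z, x, pi_hat), eta))
    return math.log(mix) if y2 == 1 else math.log(1.0 - mix)


# Hypothetical first-step probabilities and parameter values.
pi_hat = (0.2, 0.5, 0.3)
beta = [0.0, 1.0, -1.0, 0.3]
print(weight_vector(0, 0, pi_hat), loglik2_i(0, 1, 1, 0, pi_hat, beta))
```

Since the elements of $w_i$ are non-negative and sum to one, $w_i'\eta$ is always a proper probability, which keeps both logarithms well defined.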



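Then we have the following first derivative:

$$\frac{\partial\ell_2(\beta|\hat\alpha_0,\hat\alpha_2)}{\partial\beta} = \sum_i d_i \frac{\partial\ell_{2i}(\beta|\hat\alpha_0,\hat\alpha_2)}{\partial\beta}, \qquad \frac{\partial\ell_{2i}(\beta|\hat\alpha_0,\hat\alpha_2)}{\partial\beta} = \mathrm{diag}(a)\frac{\partial\ell^*_{2i}(\eta|\hat\alpha_0,\hat\alpha_2)}{\partial\eta},$$

with

$$\frac{\partial\ell^*_{2i}(\eta|\hat\alpha_0,\hat\alpha_2)}{\partial\eta} = \left[\frac{y_{i2}}{w_i'\eta} - \frac{1-y_{i2}}{1-w_i'\eta}\right] w_i,$$

where $a = \mathrm{diag}(\eta)(1-\eta)$, with $1$ denoting a column vector of ones of suitable dimension. Similarly, with $\ell^*_2(\eta|\hat\alpha_0,\hat\alpha_2) = \sum_i d_i\,\ell^*_{2i}(\eta|\hat\alpha_0,\hat\alpha_2)$ and

$$\frac{\partial^2\ell^*_2(\eta|\hat\alpha_0,\hat\alpha_2)}{\partial\eta\,\partial\eta'} = -\sum_i d_i\left[\frac{y_{i2}}{(w_i'\eta)^2} + \frac{1-y_{i2}}{(1-w_i'\eta)^2}\right] w_i w_i',$$

we have that

$$\frac{\partial^2\ell_2(\beta|\hat\alpha_0,\hat\alpha_2)}{\partial\beta\,\partial\beta'} = \mathrm{diag}(a)\frac{\partial^2\ell^*_2(\eta|\hat\alpha_0,\hat\alpha_2)}{\partial\eta\,\partial\eta'}\mathrm{diag}(a) + \mathrm{diag}(b)\,\mathrm{diag}\!\left(\frac{\partial\ell^*_2(\eta|\hat\alpha_0,\hat\alpha_2)}{\partial\eta}\right),$$

where $b = \mathrm{diag}(a)(1-2\eta)$.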

In order to compute standard errors for the parameter estimates, we use a sandwich formula for estimating the variance-covariance matrix of the overall estimator $\hat\theta = (\hat\alpha_0', \hat\alpha_2', \hat\beta')'$. In particular, we have

$$\hat\Sigma = H^{-1} K H^{-1},$$

where the matrices $H$ and $K$ are defined in the Appendix.



Along the same lines as in Bartolucci (2010) we have performed a simulation study about the performance of the proposed estimator which we do not show here for reasons of space. Our simulation study suggests good finite sample properties of the estimator, also under more general assumptions than those formulated in Section 3. Furthermore, it can be shown that if the control has the same effect on never-takers and compliers, and the treatment has the same effect on compliers and always-takers, the estimator $\hat\beta$ is consistent as $n$ grows to infinity, in symbols $\hat\beta \overset{p}{\to} \bar\beta$, with $\bar\beta = (\bar\beta_0, \bar\beta_1, \bar\beta_2, \bar\beta_3)'$ denoting the true parameter vector.

The result on existence and consistency of the estimators is not ensured to hold when $\bar\beta_0 \neq \bar\beta_1$ and/or $\bar\beta_2 \neq \bar\beta_3$. However, from the results of White (1982) on the maximum likelihood estimation of misspecified models, it derives that $\hat\beta \overset{p}{\to} \beta_*$, where $\beta_*$ is the point at which the supremum of $E\{\ell_2(\beta|\alpha_{0*}, \alpha_{2*})/n\}$ is attained. In the previous expression, $\alpha_{0*}$ and $\alpha_{2*}$ denote the limit in probability of $\hat\alpha_0$ and $\hat\alpha_2$, respectively. We therefore expect $\beta_*$ to be close to $\bar\beta$ when $\bar\beta_0$ is close to $\bar\beta_1$, $\bar\beta_2$ is close to $\bar\beta_3$, and $\pi(c|u,v)$ weakly depends on $u$. The same may be said about the estimator $\hat\delta$ of $\bar\delta$, whose limit in probability is denoted by $\delta_*$ and may be directly computed from $\beta_*$.



5 Application to randomized study on coronary angiography after myocardial infarction

In this section we describe the application of the proposed estimator to the analysis of the 
data described in Section 2. We recall that the proposed approach is based on two steps: (i) 
estimation of the model for the probability of being a never-taker, a complier, or an always-taker, 
and (ii) computation of the approximate conditional logistic estimator. 

Regarding the first step, an important point is the selection of the covariates to explain the non-compliance. In particular, we performed model choice by minimizing the Bayesian Information Criterion (BIC, see Schwarz, 1978), and finally selected two predictors (GI discretized using the quartiles and use of statins); see also Section 2. The results from fitting this model are reported in Table 2 in terms of estimates of the parameters $\alpha_0$ and $\alpha_2$, which are involved in expressions (3) and (4), and corresponding t-statistics and p-values. For the categorical variable identifying the quartile of GI, we used the last quartile as reference category.

We observe a significant non-compliance. The probabilities of being an always-taker or a 
never-taker are related in both cases with the GI and with use of statins. It can be seen that there is 
Parameter estimates for probability of being never-taker
Estimator                                 Value   Std. Err.  t-statistic  p-value
$\hat\alpha_{00}$ (Intercept)             1.604   0.531       3.017       0.002
$\hat\alpha_{01}$ (1st quartile GI)      -0.757   0.384      -1.974       0.048
$\hat\alpha_{02}$ (2nd quartile GI)      -0.886   0.368      -2.406       0.016
$\hat\alpha_{03}$ (3rd quartile GI)      -0.437   0.388      -1.125       0.260
$\hat\alpha_{04}$ (use of statin)        -0.985   0.438      -2.247       0.025

Parameter estimates for probability of being always-taker
Estimator                                 Value   Std. Err.  t-statistic  p-value
$\hat\alpha_{20}$ (Intercept)             1.454   0.597       2.436       0.015
$\hat\alpha_{21}$ (1st quartile GI)      -0.565   0.444      -1.274       0.202
$\hat\alpha_{22}$ (2nd quartile GI)      -0.862   0.434      -1.987       0.046
$\hat\alpha_{23}$ (3rd quartile GI)      -0.459   0.449      -1.023       0.306
$\hat\alpha_{24}$ (use of statin)        -0.980   0.496      -1.977       0.048

Table 2: Estimates of compliance probability parameters for the proposed model, computed on 
the prompt coronary angiography data; predictors are quartiles of glycemic index (GI) and use 
of statins. 

a significantly lower probability of being an always-taker in the second GI quartile with respect to 
the fourth, while the other two quartiles are not statistically different from the fourth. On the 
other hand, the probability of being a never-taker steadily increases with the GI category, with 
the third and fourth quartiles not being significantly different. The estimated effects of GI for 
always-takers can be explained by considering that doctors may choose to assign to prompt CA even 
patients randomized to the control (therefore making them always-takers) when they have an abnormal GI 
(here, above the median or in the first quartile). Finally, the use of statins increases compliance 
in both directions. This effect can be related to the fact that patients using statins are better 
monitored and possibly already known to doctors, and therefore a higher adherence to the 
experimental settings is easier for these patients. 
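
To see how the coefficients of Table 2 translate into compliance probabilities, the following sketch may help. It rests on an assumption that is not verified here: that expressions (3) and (4) define a multinomial logit model with compliers as the baseline category, so that the linear predictors based on $\alpha_0$ and $\alpha_2$ are the log-odds of being a never-taker and an always-taker relative to being a complier. The function and variable names are hypothetical.

```python
import math

# Point estimates copied from Table 2.
# Reference profile: fourth GI quartile, no use of statins.
ALPHA_0 = {"intercept": 1.604, "q1": -0.757, "q2": -0.886, "q3": -0.437, "statin": -0.985}
ALPHA_2 = {"intercept": 1.454, "q1": -0.565, "q2": -0.862, "q3": -0.459, "statin": -0.980}

def linear_predictor(coef, gi_quartile, statin):
    """Linear predictor for a patient in the given GI quartile (1 to 4)."""
    eta = coef["intercept"]
    if gi_quartile in (1, 2, 3):  # quartile 4 is the reference category
        eta += coef[f"q{gi_quartile}"]
    if statin:
        eta += coef["statin"]
    return eta

def compliance_probabilities(gi_quartile, statin):
    """ASSUMPTION: multinomial logit with compliers as the baseline category."""
    e0 = math.exp(linear_predictor(ALPHA_0, gi_quartile, statin))
    e2 = math.exp(linear_predictor(ALPHA_2, gi_quartile, statin))
    denom = 1.0 + e0 + e2
    return {
        "never-taker": e0 / denom,
        "complier": 1.0 / denom,
        "always-taker": e2 / denom,
    }

for statin in (False, True):
    probs = compliance_probabilities(gi_quartile=2, statin=statin)
    print("statins" if statin else "no statins",
          {k: round(v, 3) for k, v in probs.items()})
```

Under this assumed parameterization, the negative statin coefficients in both panels lower the odds of being a never-taker and of being an always-taker at the same time, which is the sense in which statin use increases compliance in both directions.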

Note that, even without covariates, the proposed method yields an approximately 
unbiased estimator of the causal effect (as seen by comparing $\hat{\delta}$ with $\hat{\delta}^{(1)}$ in Table 3), but the 
use of covariates allows us to take into account part of the heterogeneity, therefore decreasing the 
standard error of this estimate. 
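
The gain in precision can be checked directly from the values and standard errors reported in Table 3 for the proposed estimator with covariates (-2.020, 0.769) and without covariates (-1.938, 0.929). The sketch below assumes that the reported p-values are two-sided and based on the standard normal reference distribution, which approximately reproduces the tabulated figures; the helper name `wald_summary` is hypothetical.

```python
from statistics import NormalDist

def wald_summary(estimate, std_err, level=0.95):
    """Wald statistic, two-sided normal p-value, and confidence interval."""
    z = estimate / std_err
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    half_width = NormalDist().inv_cdf(0.5 + level / 2.0) * std_err
    return {
        "z": z,
        "p_value": p_value,
        "ci": (estimate - half_width, estimate + half_width),
    }

# Values and standard errors as reported in Table 3.
with_cov = wald_summary(-2.020, 0.769)     # proposed method, with covariates
without_cov = wald_summary(-1.938, 0.929)  # proposed method, no covariates

for label, res in (("with covariates", with_cov), ("without covariates", without_cov)):
    low, high = res["ci"]
    print(f"{label:20s} z = {res['z']:.3f}  p = {res['p_value']:.3f}  "
          f"95% CI = ({low:.2f}, {high:.2f})")
```

The two point estimates are close, while the interval obtained with covariates is visibly narrower, in line with the smaller standard error.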

In Table 3 we report estimates of the causal parameters and compare them with four other 
estimators. The first (denoted by $\hat{\delta}^{(1)}$) is based on our proposed approach in which no covariates 
Estimates of the causal parameters 

Estimator                                   Value   Std. Err.   t-statistic   p-value 
$\hat{\beta}_0$                             2.158     0.361        5.973     < 0.001 
$\hat{\beta}_1$                             1.948     0.677        2.878       0.004 
$\hat{\beta}_2$                            -0.072     0.370       -0.195       0.845 
$\hat{\beta}_3$                             2.252     0.455        4.945     < 0.001 

Estimates of the causal effect for compliers 

Estimator                                   Value   Std. Err.   t-statistic   p-value 
$\hat{\delta}$ (proposed method)           -2.020     0.769       -2.625       0.009 
$\hat{\delta}^{(1)}$ (proposed method)     -1.938     0.929       -2.086       0.037 
$\hat{\delta}^{(2)}$                       -0.177     0.118       -1.500       0.133 
$\hat{\delta}^{(3)}$                       -0.513     0.119       -4.311     < 0.001 
$\hat{\delta}^{(4)}$                       -0.550     0.149       -3.691     < 0.001 

Table 3: Causal parameters for the proposed model estimated on the prompt coronary angiography 
data. Predictors are GI (discretized in quartiles) and use of statins. In the bottom panel, 
$\hat{\delta}$ is compared with the same estimate when covariates are not used ($\hat{\delta}^{(1)}$) and with competing 
estimators: $\hat{\delta}^{(2)}$, standard conditional estimator based on received treatment (X); $\hat{\delta}^{(3)}$, standard 
conditional estimator based on assigned treatment (Z, Intention to Treat analysis); $\hat{\delta}^{(4)}$, standard 
conditional estimator based on the assigned and complied treatment (Per Protocol analysis). 

are used to predict compliance. The other three estimators (denoted by $\hat{\delta}^{(2)}$, $\hat{\delta}^{(3)}$, and $\hat{\delta}^{(4)}$, 
respectively) are based on conditional logistic regression on the received treatment, an Intention 
to Treat analysis, and a Per Protocol analysis. The last two are based on the assigned treatment regardless 
of the actually received treatment, and on patients actually receiving the assigned treatment, 
respectively. From the upper panel we can see that the control has approximately the same 
effect on never-takers and on compliers (with a log-odds of about 2). The treatment seems to 
have no effect on compliers, while on always-takers we once again obtain a log-odds of about 
2. We can therefore say that (i) lack of a prompt CA, regardless of whether it was assigned or 
is a result of non-compliance, may increase the risk of recurrence and (ii) if a patient who was 
assigned to the control group undergoes prompt CA, this is likely due to a possibly bad (even 
life-threatening) condition, hence the high risk of recurrent events even under the treatment. A 
consequence is that the bias of the ITT and PP estimators arises mostly from always-takers. In fact, 
the effect of the control is approximately the same on never-takers and compliers ($\beta_0 \approx \beta_1$); on 
the other hand, there is a strong difference between the effects of the treatment as estimated on compliers 
and always-takers ($\beta_2 \neq \beta_3$). 

Always-takers in this example can be expected to experience the event even after the treatment. 
Ignoring this fact will make the two groups artificially more similar, as testified by the 
estimates $\hat{\delta}^{(2)}$, $\hat{\delta}^{(3)}$, and $\hat{\delta}^{(4)}$. In fact, our most important estimate is $\hat{\delta}$, which is approximately 
$-2$. When our final estimate is compared with $\hat{\delta}^{(2)}$, $\hat{\delta}^{(3)}$, and $\hat{\delta}^{(4)}$, we find that those are at 
most little more than a quarter of our causal estimate (0.550 against 2.020 in absolute value). The estimate 
of the causal parameter based on the received treatment ($\hat{\delta}^{(2)}$) is not even significant. Standard fits 
in this example may therefore lead to a gross underestimate of the effect of a prompt CA. 
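
The comparison can be made concrete with a few lines of arithmetic on the figures of Table 3. The sketch below reads the second and third rows of the upper panel as the effects of control and of treatment on compliers (an interpretation inferred from the discussion above, under which their difference reproduces the proposed estimate), and converts each log-odds estimate of the lower panel into an odds ratio. The dictionary labels are hypothetical.

```python
import math

# Upper panel of Table 3, read (by assumption) as the effect of control
# and the effect of treatment on compliers.
beta_control_compliers = 1.948
beta_treatment_compliers = -0.072
delta_from_betas = beta_treatment_compliers - beta_control_compliers

# Lower panel of Table 3: log-odds estimates of the causal effect.
log_odds = {
    "proposed": -2.020,
    "proposed, no covariates": -1.938,
    "received treatment": -0.177,
    "intention to treat": -0.513,
    "per protocol": -0.550,
}

odds_ratios = {name: math.exp(value) for name, value in log_odds.items()}
relative_size = {name: value / log_odds["proposed"] for name, value in log_odds.items()}

print(f"treatment minus control on compliers: {delta_from_betas:.3f}")
for name in log_odds:
    print(f"{name:24s} log-odds {log_odds[name]:+.3f}  "
          f"odds ratio {odds_ratios[name]:.3f}  "
          f"share of proposed {relative_size[name]:.2f}")
```

On the odds scale, a log-odds of about -2 corresponds to an odds ratio of roughly 0.13, whereas the Intention to Treat estimate corresponds to an odds ratio of roughly 0.60.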



6 Discussion 

An approach has been introduced to estimate the causal effect of a treatment over control on the 
basis of a two-arm experimental study with possible non-compliance. The approach is applicable 
when the effect of the treatment is measured by a binary response variable observed before and 
after the treatment. It relies on a causal model formulated on the basis of latent variables that represent 
the effect of unobservable covariates at both occasions and account for the difference between 
compliers and non-compliers in terms of reaction to control and treatment. The parameters of 
the model are estimated by a pseudo conditional likelihood approach based on an approximated 
version of the conditional probability of the two response variables given their sum. The causal 
model and the proposed estimator extend the model and the estimator of Bartolucci (2010) to 
the case in which non-compliance may also happen in the control arm. 
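
The reason for conditioning on the sum of the two responses can be illustrated in a deliberately simplified setting, which is not the model of this paper: a single subject-specific effect shifts the log-odds at both occasions, the second occasion adds an effect `delta`, and the two responses are independent given the subject effect. Under these assumptions, the probability that the event occurs only at the second occasion, given that it occurs exactly once, equals the logistic transform of `delta` whatever the subject effect. All names below are hypothetical.

```python
import math

def expit(x):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-x))

def prob_event_only_after(subject_effect, delta):
    """P(Y1 = 0, Y2 = 1 | Y1 + Y2 = 1) in the simplified model.

    ASSUMPTIONS: logit P(Y1 = 1) = subject_effect,
                 logit P(Y2 = 1) = subject_effect + delta,
                 Y1 and Y2 independent given subject_effect.
    """
    p1 = expit(subject_effect)
    p2 = expit(subject_effect + delta)
    p01 = (1.0 - p1) * p2   # event only after the treatment
    p10 = p1 * (1.0 - p2)   # event only before the treatment
    return p01 / (p01 + p10)

delta = -2.0
for subject_effect in (-3.0, 0.0, 2.5):
    print(subject_effect, prob_event_only_after(subject_effect, delta), expit(delta))
```

The unobserved subject effect cancels from the conditional probability, which is why a conditional approach can dispense with modelling it. In the model of this paper the cancellation is only approximate, hence the pseudo conditional likelihood.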

The method is applied to the analysis of data coming from a study on the effect of prompt 
coronary angiography in myocardial infarction. The application shows that prompt coronary 
angiography in patients with myocardial infarction may significantly decrease the risk of other 
events within the next two years, with a log-odds of about -2. On the other hand, estimates of 
this log-odds ratio obtained by the standard logistic approach are considerably closer to 0. 

One of the basic assumptions on which the approach relies is that a subject is assigned to 
the control arm or to the treatment arm with a probability depending only on the observable 
covariates and not on the pre-treatment response variable. Indeed, we could relax this assumption, 
but we would have much more complex expressions for the conditional probability of the 
response variables given their sum. 

As a final comment, we remark that we only considered the case of repeated response variables 
having a binary nature. However, the approach may be directly extended to the case 
of response variables having a different nature (e.g. counts), provided that the conditional 
distribution of these variables belongs to the natural exponential family and the causal effect is 
measured on a scale defined according to the canonical link function for the adopted distribution 
(McCullagh and Nelder, 1989). 
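
As an illustration of why the canonical link matters for such an extension, consider a simplified count model that is not developed in this paper: two independent Poisson responses with log-means `subject_effect` and `subject_effect + delta`. Conditionally on their sum, the second count is binomial with success probability equal to the logistic transform of `delta`, so the subject effect again drops out. The sketch below checks this numerically; all names are hypothetical.

```python
import math

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def poisson_pmf(k, mean):
    return math.exp(-mean) * mean ** k / math.factorial(k)

def binomial_pmf(k, n, p):
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def conditional_pmf(k, total, subject_effect, delta):
    """P(Y2 = k | Y1 + Y2 = total) for independent Poisson counts.

    ASSUMPTIONS: log E(Y1) = subject_effect, log E(Y2) = subject_effect + delta.
    """
    mu1 = math.exp(subject_effect)
    mu2 = math.exp(subject_effect + delta)
    joint = poisson_pmf(total - k, mu1) * poisson_pmf(k, mu2)
    norm = sum(poisson_pmf(total - j, mu1) * poisson_pmf(j, mu2)
               for j in range(total + 1))
    return joint / norm

delta, total = 0.7, 5
for subject_effect in (-1.0, 0.5):
    row = [round(conditional_pmf(k, total, subject_effect, delta), 4)
           for k in range(total + 1)]
    print(subject_effect, row)
print("binomial", [round(binomial_pmf(k, total, expit(delta)), 4)
                   for k in range(total + 1)])
```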



Appendix: Matrices involved in the sandwich estimator for the variance of the estimator 
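
Before the matrices are given, the general mechanics of a sandwich variance for a two-step estimator can be sketched on a toy problem that is unrelated to the model of this paper. In the sketch, a first parameter `a` is estimated from its own estimating equation and a second parameter `b` is estimated given the first. The labels `H` (matrix of derivatives of the summed scores) and `K` (sum of outer products of the individual scores), as well as all function names, are assumptions made for this illustration only.

```python
def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(p, q):
    return [[sum(p[r][k] * q[k][c] for k in range(len(q)))
             for c in range(len(q[0]))] for r in range(len(p))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def two_step_sandwich(x, y):
    """Toy two-step estimator with sandwich variance H^{-1} K (H^{-1})'.

    Step 1: a_hat solves sum_i (x_i - a) = 0.
    Step 2: b_hat solves sum_i (y_i - a_hat - b) = 0, given a_hat.
    """
    n = len(x)
    a_hat = sum(x) / n
    b_hat = sum(y) / n - a_hat
    # Individual score contributions, evaluated at the estimates.
    scores = [(xi - a_hat, yi - a_hat - b_hat) for xi, yi in zip(x, y)]
    # Derivative of the summed scores with respect to (a, b). The zero in the
    # first row reflects that the first step does not involve b.
    H = [[-float(n), 0.0],
         [-float(n), -float(n)]]
    # Sum of outer products of the individual scores.
    K = [[sum(u[r] * u[c] for u in scores) for c in range(2)] for r in range(2)]
    H_inv = inverse_2x2(H)
    V = matmul(matmul(H_inv, K), transpose(H_inv))
    return a_hat, b_hat, V

x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 3.0, 3.0, 9.0]
a_hat, b_hat, V = two_step_sandwich(x, y)
print(a_hat, b_hat, V)
```

The off-diagonal entry in the second row of `H` is what carries the uncertainty of the first step into the variance of the second-step estimate; setting it to zero would treat `a_hat` as known.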



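We have that the matrix of second derivatives involved in the sandwich formula is 

\[
\begin{pmatrix}
\dfrac{\partial^2 \ell_1(\hat{\alpha}_0,\hat{\alpha}_2)}{\partial\alpha_0\,\partial\alpha_0'} &
\dfrac{\partial^2 \ell_1(\hat{\alpha}_0,\hat{\alpha}_2)}{\partial\alpha_0\,\partial\alpha_2'} & O \\[2ex]
\dfrac{\partial^2 \ell_1(\hat{\alpha}_0,\hat{\alpha}_2)}{\partial\alpha_2\,\partial\alpha_0'} &
\dfrac{\partial^2 \ell_1(\hat{\alpha}_0,\hat{\alpha}_2)}{\partial\alpha_2\,\partial\alpha_2'} & O \\[2ex]
\dfrac{\partial^2 \ell_2(\hat{\beta}\mid\hat{\alpha}_0,\hat{\alpha}_2)}{\partial\beta\,\partial\alpha_0'} &
\dfrac{\partial^2 \ell_2(\hat{\beta}\mid\hat{\alpha}_0,\hat{\alpha}_2)}{\partial\beta\,\partial\alpha_2'} &
\dfrac{\partial^2 \ell_2(\hat{\beta}\mid\hat{\alpha}_0,\hat{\alpha}_2)}{\partial\beta\,\partial\beta'}
\end{pmatrix}
\]

and the matrix based on the individual score contributions is 

\[
\sum_i
\begin{pmatrix}
\dfrac{\partial \ell_{1i}(\hat{\alpha}_0)}{\partial\alpha_0} \\[2ex]
\dfrac{\partial \ell_{1i}(\hat{\alpha}_2)}{\partial\alpha_2} \\[2ex]
\dfrac{\partial \ell_{2i}(\hat{\beta}\mid\hat{\alpha}_0,\hat{\alpha}_2)}{\partial\beta}
\end{pmatrix}
\begin{pmatrix}
\dfrac{\partial \ell_{1i}(\hat{\alpha}_0)}{\partial\alpha_0'} &
\dfrac{\partial \ell_{1i}(\hat{\alpha}_2)}{\partial\alpha_2'} &
\dfrac{\partial \ell_{2i}(\hat{\beta}\mid\hat{\alpha}_0,\hat{\alpha}_2)}{\partial\beta'}
\end{pmatrix}.
\]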



In the above expressions, O denotes a matrix of zeros of suitable dimension. Moreover, all the 
derivatives have been defined, with the exception of the derivative of $\ell_2(\beta \mid \hat{\alpha}_0, \hat{\alpha}_2)$ with respect 
to $\alpha_0$ (or $\alpha_2$) and $\beta$. In particular, we have that: 



d 2 £*Md ,d 2 ) 
d(3da' c 



diag(a) d, t 



Vi2 



1 - Vi2 \ dw 



w^rj 1 — w^rf J da' c ' 



c=0,2, 



where 



dWj _ \ (^i(0|«0^oi( 1 l w i))-^0l( l v i)^0l( 1 l v i)» »°)'^( v i)» Zi = Xi = \ 
d a o I O, otherwise, 
and 



da' 



dibi 